Path: blob/master/Part 7 - Natural Language Processing/[R] Natural Language Processing.ipynb
Kernel: R
Natural Language Processing
Data Preprocessing
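As a rough illustration of this step, the sketch below reads tab-separated review data into a data frame. The column names `Review` and `Liked`, the sample sentences, and the inline `raw` string are all hypothetical stand-ins; the dataset actually used here may be structured differently.

```r
# Hypothetical two-column, tab-separated input: review text plus a 0/1 label.
raw <- "Review\tLiked\nWow... Loved this place.\t1\nCrust is not good.\t0\nDon't go there\t0"

# quote = "" keeps quotation marks inside reviews as literal characters;
# stringsAsFactors = FALSE keeps the review text as plain character strings.
dataset <- read.delim(text = raw, quote = "", stringsAsFactors = FALSE)

str(dataset)
```

With a real file, the `text = raw` argument would be replaced by the file path.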
Cleaning the texts
In [4]:
Out[4]:
Loading required package: NLP
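The message above is what attaching the `tm` package prints, since `tm` depends on `NLP`. A typical `tm` cleaning pipeline might look like the sketch below. This is an assumed sequence of steps on two made-up sentences, not necessarily the exact order or set of transformations used in this notebook; the stemming step additionally assumes the `SnowballC` package is installed.

```r
library(tm)         # attaching tm also loads the NLP package
library(SnowballC)  # stemming backend used by stemDocument()

# Hypothetical sample reviews.
reviews <- c("Wow... Loved this place!", "Crust is NOT good, 2 stars.")

corpus <- VCorpus(VectorSource(reviews))
corpus <- tm_map(corpus, content_transformer(tolower))       # lower-case everything
corpus <- tm_map(corpus, removeNumbers)                      # drop digits
corpus <- tm_map(corpus, removePunctuation)                  # drop punctuation
corpus <- tm_map(corpus, removeWords, stopwords("english"))  # drop common words
corpus <- tm_map(corpus, stemDocument)                       # reduce words to stems
corpus <- tm_map(corpus, stripWhitespace)                    # collapse repeated spaces

# Inspect the cleaned text of each document.
cleaned <- vapply(seq_along(corpus),
                  function(i) as.character(corpus[[i]]),
                  character(1))
print(cleaned)
```

Note that the default English stop-word list includes negations such as "not", which can matter for sentiment-style labels.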
Creating the Bag of Words model
In [22]:
Out[22]:
<<DocumentTermMatrix (documents: 1000, terms: 1577)>>
Non-/sparse entries: 5435/1571565
Sparsity : 100%
Maximal term length: 32
Weighting : term frequency (tf)
In [24]:
Out[24]:
<<DocumentTermMatrix (documents: 1000, terms: 691)>>
Non-/sparse entries: 4549/686451
Sparsity : 99%
Maximal term length: 12
Weighting : term frequency (tf)
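The two summaries above show the vocabulary shrinking from 1577 to 691 terms, which is the kind of result `removeSparseTerms()` from `tm` produces. The sketch below shows the idea on a four-document toy corpus. The documents and the threshold `0.6` are illustrative assumptions, not the values used in this notebook.

```r
library(tm)

# Hypothetical, already-cleaned documents.
docs   <- c("good food", "good service", "bad food", "good place")
corpus <- VCorpus(VectorSource(docs))

# One row per document, one column per term, cells hold term counts.
dtm <- DocumentTermMatrix(corpus)

# With sparse = 0.6 and 4 documents, a term is kept only if it occurs in
# more than (1 - 0.6) * 4 = 1.6 documents, i.e. in at least 2 of them.
dtm_small <- removeSparseTerms(dtm, 0.6)

# Dense data frame suitable as classifier input.
dataset <- as.data.frame(as.matrix(dtm_small))
print(Terms(dtm_small))
```

Here only "food" and "good" survive, because every other term appears in a single document.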
In [27]:
Out[27]:
   y_pred
     0  1
  0  9 91
  1  7 93
In [28]:
Out[28]:
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
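For orientation, a minimal sketch of fitting and evaluating a random forest on bag-of-words features follows. The toy data frame, the column names `good`, `bad` and `Liked`, the seed, the split, and `ntree = 10` are all assumptions made for illustration; the features and settings used in this notebook are not shown.

```r
library(randomForest)

set.seed(123)  # arbitrary seed so the toy run is repeatable

# Hypothetical term counts for two words plus a 0/1 label.
toy <- data.frame(
  good  = c(1, 0, 1, 0, 1, 0, 1, 0),
  bad   = c(0, 1, 0, 1, 0, 1, 0, 1),
  Liked = factor(c(1, 0, 1, 0, 1, 0, 1, 0), levels = c(0, 1))
)
train <- toy[1:6, ]
test  <- toy[7:8, ]

# A factor response makes randomForest() fit a classifier.
classifier <- randomForest(x = train[, c("good", "bad")],
                           y = train$Liked,
                           ntree = 10)

# Predict on held-out rows and cross-tabulate against the true labels.
y_pred <- predict(classifier, newdata = test[, c("good", "bad")])
cm <- table(test$Liked, y_pred)
print(cm)
```

Passing `y_pred` as the second argument to `table()` yields a column header named `y_pred`, matching the layout of the confusion matrices printed in this notebook.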
In [29]:
Out[29]:
   y_pred
     0  1
  0 76 24
  1 28 72
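The two confusion matrices printed in this notebook can be compared by accuracy. The sketch below re-enters their counts by hand and assumes rows are the actual labels and columns the predictions, which is consistent with each row summing to 100. The names `cm_first`, `cm_second` and `accuracy` are hypothetical.

```r
# Counts copied from the two confusion matrices (assumed: rows = actual).
labels    <- list(actual = c("0", "1"), y_pred = c("0", "1"))
cm_first  <- matrix(c(9, 91, 7, 93),   nrow = 2, byrow = TRUE, dimnames = labels)
cm_second <- matrix(c(76, 24, 28, 72), nrow = 2, byrow = TRUE, dimnames = labels)

# Accuracy = correctly classified (diagonal) / all test observations.
accuracy <- function(cm) sum(diag(cm)) / sum(cm)

accuracy(cm_first)   # (9 + 93) / 200  = 0.51
accuracy(cm_second)  # (76 + 72) / 200 = 0.74
```

Under that assumption, the first model labels almost every review as class 1 and is barely better than chance, while the second model is correct on about three quarters of the 200 test reviews.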